Abstract

We develop a macro-model of the information retrieval process using
game theory as a mathematical theory of conflicts. We represent
the participants of the information retrieval process as two abstract
players of a game. The first player is the 'intellectual crowd' of
users of search engines, the second is a community of information
retrieval systems. In order to apply game theory, we treat search log
data as Nash equilibrium strategies and solve the inverse problem of
finding appropriate payoff functions. For that, we suggest a particular
model, which we call the Alpha model. Within this model, we suggest a
method, called shifting, which makes it possible to partially control
the mass behavior of users.

This Note is addressed to researchers in both game theory (providing
a new class of real-life problems) and information retrieval, for
whom we present new techniques to control the IR environment.



Introduction 

The techniques we present are inspired by the success of the macro-approach in
both natural and social sciences. In thermodynamics, starting from the chaotic
motion of billions of billions of microparticles, we arrive at a simple, transparent,
strongly predictive theory with a few macro-variables, such as temperature,
pressure, and so on. In models of market behavior the chaotic motion is
present as well, but there are two definite parties, each consisting of a large
number of individuals with common interests, whose behavior is not coordinated.

From a global perspective, information retrieval looks similar: there are
many individual seekers of knowledge on one side and a number of knowledge
providers on the other; both are chaotic and uncoordinated. There
are two definite parties, whose members have similar interests, and every
member of each party tends to fulfill his own interests as much as possible. How
could a Mathematician help them? At first sight, each party could be advised
to solve a profit maximization problem. But back in 1928 J. von
Neumann realized that this approach is inadequate: you cannot maximize
a value you do not know. In fact, the profit gained by each agent
depends not only on its own actions, but also on the activities of its counterpart,
which are not known. Game theory was then developed, replacing the
notion of optimality by that of acceptability. Similarly, the crucial point of
information retrieval, in contrast to data retrieval, is to get some satisfaction
(a feeling of relevance) rather than to retrieve something exact. This analogy
was the starting point for us to explore applications of game theory to the
problems of information retrieval.

The standard problem of game theory is the search for strategies that are
reasonable (in various senses). When the rules of the game are given, there is a
vast machinery which makes it possible to calculate such strategies. In
information retrieval we have two parties whose interaction is exactly of a game
nature, but the rules of this game are not explicitly formulated. However,
we may observe the consequences of these rules as users' behavior; that is,
we deal with the inverse problem of game theory, studied by Dragan [2] for
cooperative games. In this Note we extend it to the non-cooperative case. It
turns out that the solution of the inverse problem is essentially non-unique:
different rules can produce the same behavior. We suggest a particular class
of models, called Alpha models, describing an idealized search system similar
to the Wolfram Alpha engine.

How can search engine managers benefit from these techniques? Game theory
can work out definite recommendations on how to control the interaction
between the parties of the information retrieval process. This sounds unrealistic:
can one control massive chaotic behavior? Thermodynamics shows us
that the answer is yes. We cannot control individual molecules, but in order
to alter their collective behavior we are able to change macroparameters: the
engine of your car is a reminder of that. In our case the payoff functions of the
Alpha model are just those parameters.

In Section 1 we introduce (only the necessary) basic notions from game
theory; in Section 2 we formulate the information retrieval process in terms
of game theory and state our method as the inverse problem in game
theory. In Section 3 we suggest a particular solution, which we call the Alpha
model as it resembles the Wolfram Alpha engine, and in Section 4 we suggest a
method to control the mass behavior of users.



1 Direct problem: classical game theory 

Game theory is a mathematical theory studying conflicts and trade-offs. It
involves rational participants who follow formal rules. A game is specified by
its players, the players' strategies and the players' payoffs. We begin with a well-known
example (a reformulated Prisoners' dilemma [5]).

There are two players, A and B. Player A chooses a color, Red or
Green, while B chooses a direction, Left or Right. The rules of the game
are specified by the following pair of payoff matrices (Table 1).

Table 1: A game with domination, defined by its pair of payoff matrices
having the following meaning: if A chooses Green and B chooses Right, A
gains 20 and B gains 17, and so on.



The Mathematician can predict the outcome of this game provided the players
are rational, namely, wish to gain more: the rational player A will
necessarily choose Red and B will choose Left.



However, both players know the payoff matrices, so, being rational, why
can't they agree that A chooses Green and B chooses Right? The
point is that they act independently, which excludes any agreement.
Games of this kind are called non-cooperative, and this is the case for the IR
community.

The peculiarity of the above example is that it has a unique
(and therefore straightforward) solution. However, examples of this kind
do not describe the generic situation. Now let us consider a more general
example (Table 2).

Table 2: A non-dominating case: two Nash equilibria.



First note that no player has a dominating strategy here, so the outcome
of the game is at first glance unpredictable. However, the Mathematician
can predict the outcome of this game as well. First, we see that neither (Green,
Left) nor (Red, Right) will be realized by rational players¹. One of
the following two pairs (just as in the maritime Rules of the Road)
will necessarily occur: (Red, Left) or (Green, Right). Why so? The
motivation for a rational player to stick to a certain strategy is that leaving
it unilaterally does not increase his gain:

\[
\begin{cases}
H_A(\mathrm{Red}, \mathrm{Left}) \geq H_A(\alpha, \mathrm{Left}) \\
H_B(\mathrm{Red}, \mathrm{Left}) \geq H_B(\mathrm{Red}, \beta)
\end{cases}
\tag{1}
\]

where $H_A(\alpha, \beta)$ ($H_B(\alpha, \beta)$, resp.) is the gain of A (B, resp.) when A
chooses strategy $\alpha$ and B chooses $\beta$. The relations (1) are the famous Nash
inequalities. A pair of strategies is said to form a Nash equilibrium if
they satisfy these inequalities. In the above example the pair of strategies
(Red, Left) is a Nash equilibrium, but so is the pair (Green, Right)!
So, what will be the Mathematician's prediction for the outcome of this
game? He will point out what will not occur and what will take place stably.

¹ How it works: suppose A chooses Green, observes that he gains only 5 and then
switches to Red, which brings him 10.
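The check of the Nash inequalities for every pair of pure strategies can be sketched in Python. This is only an illustration: the payoff values are hypothetical, chosen so that the game has the two equilibria discussed above, and the name `pure_nash_equilibria` is an assumption of this sketch rather than part of the model.

```python
from itertools import product

A_STRATEGIES = ("Red", "Green")
B_STRATEGIES = ("Left", "Right")

# Hypothetical payoffs (illustrative assumption), keyed by (A's choice, B's choice).
H_A = {("Red", "Left"): 10, ("Red", "Right"): 3,
       ("Green", "Left"): 5, ("Green", "Right"): 8}
H_B = {("Red", "Left"): 7, ("Red", "Right"): 2,
       ("Green", "Left"): 4, ("Green", "Right"): 9}


def pure_nash_equilibria(h_a, h_b, a_strategies, b_strategies):
    """Return all pairs from which no player gains by deviating alone."""
    equilibria = []
    for alpha, beta in product(a_strategies, b_strategies):
        # A keeps alpha if no other choice pays more against the fixed beta.
        a_stays = all(h_a[(alpha, beta)] >= h_a[(other, beta)]
                      for other in a_strategies)
        # B keeps beta if no other choice pays more against the fixed alpha.
        b_stays = all(h_b[(alpha, beta)] >= h_b[(alpha, other)]
                      for other in b_strategies)
        if a_stays and b_stays:
            equilibria.append((alpha, beta))
    return equilibria


print(pure_nash_equilibria(H_A, H_B, A_STRATEGIES, B_STRATEGIES))
# → [('Red', 'Left'), ('Green', 'Right')]
```

With these illustrative numbers the pairs (Green, Left) and (Red, Right) are rejected because player A gains by switching color.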

Now let us pass to the next example (Table 3), which is generic.

Table 3: No Nash equilibria.



We see that there are no equilibrium pairs of strategies in this game; that
is, if the players are represented by individuals, the outcome of an instance
of the game cannot be predicted. What can the Mathematician tell us now?
He will suggest considering players represented by communities. The choice of
a strategy by a collective player is described by the distribution of its
individuals with respect to the strategies they choose:

\[
p = (p_{\mathrm{Red}}, p_{\mathrm{Green}}), \qquad
q = (q_{\mathrm{Left}}, q_{\mathrm{Right}}) \tag{2}
\]

The gain of the collective players with respect to the chosen pair of strategies
is the average:

\[
H_A(p, q) = \sum_{j,k} a_{jk}\, p_j q_k, \qquad
H_B(p, q) = \sum_{j,k} b_{jk}\, p_j q_k \tag{3}
\]

where $[a_{jk}]$, $[b_{jk}]$ are the payoff matrices for the players A and B, respectively.

The prediction of the outcome of the game is now a pair of distributions
$(p^*, q^*)$ obtained from the same Nash inequalities (1), but now applied to
averages:

\[
\begin{cases}
H_A(p^*, q^*) \geq H_A(p, q^*) \\
H_B(p^*, q^*) \geq H_B(p^*, q)
\end{cases}
\tag{4}
\]

The fundamental result of game theory is the Nash theorem [3], which states
that an equilibrium in the sense of (4) always exists. Moreover, for two
players with two strategies each and a mixed equilibrium, the answer can be
written explicitly:

\[
p_1 = \frac{b_{22} - b_{21}}{b_{11} + b_{22} - b_{12} - b_{21}}, \qquad
q_1 = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \tag{5}
\]

where $p_1$ ($q_1$, resp.) is the share of the first strategy of A (B, resp.).
Note that the behavior of the player A is completely determined only by
the payoff matrix of the player B and vice versa.
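A minimal sketch of the two-strategy case, assuming the standard indifference conditions for a completely mixed equilibrium: each player mixes so that the opponent gains the same from either pure strategy. The payoff matrices and the function names are hypothetical and serve only to illustrate that A's distribution is computed from B's matrix alone.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b):
    """Completely mixed equilibrium of a 2x2 bimatrix game.

    a[j][k], b[j][k] are the payoffs of A and B when A plays j and B plays k.
    Returns (p1, q1), the shares of the first strategy of A and of B.
    Assumes non-zero denominators and a result inside (0, 1).
    """
    p1 = Fraction(b[1][1] - b[1][0], b[0][0] + b[1][1] - b[0][1] - b[1][0])
    q1 = Fraction(a[1][1] - a[0][1], a[0][0] + a[1][1] - a[0][1] - a[1][0])
    return p1, q1


def average_gain(m, p1, q1):
    """Average payoff sum_jk m[j][k] p_j q_k for the given shares."""
    p = (p1, 1 - p1)
    q = (q1, 1 - q1)
    return sum(m[j][k] * p[j] * q[k] for j in range(2) for k in range(2))


# Hypothetical payoffs without a pure-strategy equilibrium (assumption).
A = [[2, 0], [0, 1]]
B = [[0, 3], [1, 0]]

p1, q1 = mixed_equilibrium_2x2(A, B)
print(p1, q1)  # → 1/4 1/3

# At equilibrium each player is indifferent between his two pure strategies.
print(average_gain(A, 1, q1) == average_gain(A, 0, q1))  # → True
print(average_gain(B, p1, 1) == average_gain(B, p1, 0))  # → True
```

Replacing the matrix `A` by any other matrix leaves `p1` unchanged, which is the dependence noted in the text.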



2 Crowd Meets Crowd — Inverse Problem 

In this section we describe our IR macromodel as a non-antagonistic conflict
of two parties or, in other words, a non-cooperative game of two players. The
first player, call it A, asks questions; the second, call it B, provides answers.
The player A stands for the community of users (the intellectual crowd) of IR
systems, the player B stands for the community of providers of search results
(which is symmetrically treated as an intellectual crowd).

Each particular strategy $\alpha_j$ of the player A is just typing something into a
search box. Each particular strategy $\beta_k$ of the player B is to return a page
with an answer, which, viewed as HTML code, is a string of symbols as
well. An instance of the game is a pair

\[
\alpha_j \beta_k = (\text{input-string}, \text{returned-string})
\]

which is somehow evaluated by each participant. For example, the payoff
value $H_A(\alpha_j \beta_k)$ for the player A for the pair

\[
\alpha_j \beta_k = (\text{'accommodation'}, \text{'No results found'})
\]

is evidently low. At the same time, we do not dare to ascribe any payoff
value $H_B(\alpha_j \beta_k)$ to this instance for the player B (we do not know the providers'
priorities). In more general situations even the evaluations of the player A are
not known.

However, numerical payoff values are needed in order to apply game theory:
its basic concept, the Nash equilibrium, is based on the comparison of
instances (1). As a matter of fact, the participants of the IR process do compare
instances, but they do it qualitatively. But the Mathematician needs
numbers! What data should he process in order to get them?

Stability and equilibrium. In a sense this week's World Wide Web is the
same as it was a week ago, whatever the variety of different queries and
answers. What is stable in time is the statistics of instances $\alpha_j \beta_k$: things
frequently asked yesterday are asked again today. The Mathematician tells us that
from a game-theoretic perspective this stability is not surprising: these are
Nash equilibria, which are stable because leaving them is unfavorable.

If we knew the payoff functions, we could find the Nash equilibrium.
But in our situation we know the equilibrium (the statistics of instances) and we
have to find the appropriate payoff functions $H_A(\alpha_j \beta_k)$, $H_B(\alpha_j \beta_k)$ in (3).
This is the inverse problem in game theory [2]. The inverse problem has
multiple solutions: for given frequencies there are many different payoff matrices
yielding the same equilibrium². Below, we introduce a specific model,
called the Alpha model, with the smallest number of free parameters.



3 Alpha model 

The raw material for us will be a collection of search strings with their
frequencies and a collection of returned results with their frequencies
as well. According to our model, we interpret it as a realized equilibrium.
Now we are about to reconstruct the payoff functions. First, according to
the remark made above, we assume that the number of different strategies
is the same for both players. If not, we may achieve this by appropriate preprocessing
of the data, identifying some data strings.

Note that, given a pair of strategies $(p, q)$, there are (infinitely) many
different payoff functions for which this pair of strategies is an equilibrium.
Among all such models, we consider the simplest one, closest to data retrieval.
For this model, the payoff matrices are diagonal:

\[
[a_{jk}] = \mathrm{diag}(a_1, \ldots, a_n), \qquad
[b_{jk}] = \mathrm{diag}(b_1, \ldots, b_n) \tag{6}
\]

where $a_1, \ldots, a_n$, $b_1, \ldots, b_n$ are positive numbers.



The feature of this model is that the only valuable answer to the question
$\alpha_j$ is $\beta_j$ with the same index $j$; the other answers $\beta_k$ with $k \neq j$ are of zero value.
This resembles the Wolfram Alpha search engine, which provides the only answer
to a query; that is why we call our model Alpha.

² A trivial example of such non-uniqueness is multiplying the payoff matrix by a positive
number.

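The Nash equilibrium for the game is given by:

\[
p_j = \frac{1/b_j}{\sum_k 1/b_k}, \qquad
q_k = \frac{1/a_k}{\sum_j 1/a_j} \tag{7}
\]

We can verify this by checking the Nash inequalities (4) directly. It is sufficient [4]
to check them only for pure strategies:

\[
H_A(\alpha_j, q) \leq H_A(p, q), \qquad
H_B(p, \beta_k) \leq H_B(p, q) \tag{8}
\]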
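The pure-strategy check can be illustrated numerically. The sketch below assumes diagonal payoff matrices and an equilibrium in which each player's frequencies are inversely proportional to the other player's diagonal payoffs; the payoff values and function names are hypothetical.

```python
from fractions import Fraction


def alpha_equilibrium(a, b):
    """Equilibrium (p, q) of the diagonal game: p_j ~ 1/b_j, q_k ~ 1/a_k."""
    inv_a = [Fraction(1, x) for x in a]
    inv_b = [Fraction(1, x) for x in b]
    p = [x / sum(inv_b) for x in inv_b]
    q = [x / sum(inv_a) for x in inv_a]
    return p, q


def average_gain(diag, p, q):
    """Average gain sum_j d_j p_j q_j for a diagonal payoff matrix."""
    return sum(d * pj * qj for d, pj, qj in zip(diag, p, q))


# Hypothetical diagonal payoffs of A and B (illustrative assumption).
a = [1, 2, 4]
b = [2, 3, 6]
p, q = alpha_equilibrium(a, b)

gain_a = average_gain(a, p, q)
gain_b = average_gain(b, p, q)

# Gain of A when he plays the pure strategy j against q is a_j * q_j,
# and symmetrically b_k * p_k for B against p.
pure_a = [aj * qj for aj, qj in zip(a, q)]
pure_b = [bk * pk for bk, pk in zip(b, p)]

print(all(g <= gain_a for g in pure_a), all(g <= gain_b for g in pure_b))
# → True True
```

In this example every pure strategy yields exactly the equilibrium gain, so no unilateral deviation is profitable.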

Recall that we have the inverse problem, that is, we know $(p, q)$. Its
solution is

\[
a_j = \frac{a}{q_j}, \qquad b_k = \frac{b}{p_k} \tag{9}
\]

for any fixed positive numbers $a$, $b$. The obtained result shows us that:

• The less frequent an instance is, the higher is its value.

• The value of a question is determined by the frequency of the reply, 
and vice versa, the value of a reply is determined by the frequency of 
the question. 

The first statement means that within this model frequently asked questions 
have low value for the provider B, and, vice versa, rarely delivered answers 
are of high value for the user A. 

The magic of Nash theory is captured in the second statement. It means 
that the behavior of player A is completely determined only by the payoff 
matrix of player B. In other words, the popularity (=frequency) of users' 
questions depends on priorities of the answering side rather than on their 
own priorities. 
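The reconstruction of payoffs from observed frequencies can be sketched as follows, assuming the solution has the form $a_j = a/q_j$, $b_k = b/p_k$. The counts are invented for illustration and are not taken from any search log; the function names are likewise hypothetical.

```python
from fractions import Fraction


def frequencies(counts):
    """Turn raw counts into relative frequencies."""
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


def alpha_payoffs(p, q, a=1, b=1):
    """Diagonal payoffs of the Alpha model for observed frequencies (p, q)."""
    a_diag = [Fraction(a) / qj for qj in q]  # value for the user A
    b_diag = [Fraction(b) / pk for pk in p]  # value for the provider B
    return a_diag, b_diag


# Hypothetical log: how often each query and each answer occurred.
query_counts = [50, 30, 20]
answer_counts = [40, 40, 20]

p = frequencies(query_counts)
q = frequencies(answer_counts)
a_diag, b_diag = alpha_payoffs(p, q)

print(a_diag)
print(b_diag)

# Consistency check: frequencies proportional to 1/b_k reproduce p.
inv_b = [1 / x for x in b_diag]
print([x / sum(inv_b) for x in inv_b] == p)
```

The rarest query receives the largest bonus in `b_diag`, and that bonus is computed from the query frequencies only.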



4 Shifting of users' behavior 

So far, we have suggested a quantitative model of the IR process. The aim of
this model is not just to describe, but also to give some means of control
over the overall process. There are two parties involved, each having its own
interests. Let us consider what the provider B could do in order to increase
its gain.

At first sight, the strategy $q$ should be changed, but the power of Nash
theory is that the answer is immediate: this makes no sense, since any unilateral
deviation from the equilibrium is unfavorable for B. The player B cannot
directly, by decree, control the strategy $p$ of player A, nor its payoff matrix.
So, the only thing B can do is to change its own interests: what remains
under the control of B is its own payoff matrix. How does it work?

A simple suggestion is to multiply all the elements of B by, say, 1957.
This suggestion does not affect, as follows from (7), the strategy of player
A: it is similar to converting your wealth from euros to Italian lire: you
may feel happy, but your wealth will not grow. Therefore, we have to impose
a normalization condition on the bonuses of B in order to fix the
scale. Let us suppose their total amount $\mathfrak{B}$ to be fixed:

\[
\sum_k b_k = \mathfrak{B} = \mathrm{const} \tag{10}
\]

As was shown in the previous section, the strategy of A depends only on
the payoffs of B. Hence, changing the matrix B will affect the behavior of
its counterpart A. Furthermore, the statistics of instances will change and,
therefore, the average gain of B will change. Let us first calculate how the
average gain $H_B$ of B depends on the parameters of its payoff matrix (6):

\[
H_B(p, q) = \sum_j b_j\, p_j q_j \tag{11}
\]

for any strategies $p_j$, $q_j$. Within our model we know, however, that in
equilibrium $p_j = b / b_j$ (9), therefore the optimal average gain is:

\[
H_B(p, q) = \sum_j b\, q_j = b \tag{12}
\]

The value of the multiple $b$ can now be derived from (9) and the condition
$\sum_k p_k = 1$; therefore the optimal gain of the player B reads:

\[
H_B = b = \Bigl( \sum_k \frac{1}{b_k} \Bigr)^{-1} \tag{13}
\]

Now let us explore how the optimal gain $H_B$ changes under small variations
$db_k$ of the parameters of the Alpha model. It follows from the normalization
condition (10) that

\[
\sum_k db_k = 0 \tag{14}
\]

Now calculate the gradient of the optimal gain $H_B$:

\[
\frac{\partial H_B}{\partial b_k} = \frac{H_B^2}{b_k^2} = p_k^2 \tag{15}
\]




The variations $db_k$ are obtained, up to a small positive factor, from the gradient
$\nabla H_B$ by requiring the condition (14) to be satisfied:

\[
db_k = p_k^2 - \frac{1}{n} \sum_i p_i^2 \tag{16}
\]

where the sum $\sum_i p_i^2$ is the unnormalized Yule's characteristic [5], reflecting the
diversity of queries.
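The gradient can be checked numerically. The sketch below assumes that the optimal gain is $H_B = 1 / \sum_k (1/b_k)$, which is what $p_k = b/b_k$ and $\sum_k p_k = 1$ imply, and compares the analytic expression $p_k^2$ with a central finite difference. The bonus values and function names are hypothetical.

```python
def optimal_gain(b):
    """Optimal gain of B, assuming H_B = 1 / sum_k (1 / b_k)."""
    return 1.0 / sum(1.0 / x for x in b)


def gradient(b):
    """Analytic gradient: dH_B/db_k = (H_B / b_k)**2 = p_k**2."""
    gain = optimal_gain(b)
    return [(gain / x) ** 2 for x in b]


def numerical_gradient(b, h=1e-6):
    """Central finite differences, used only as an independent check."""
    grad = []
    for k in range(len(b)):
        up = list(b)
        down = list(b)
        up[k] += h
        down[k] -= h
        grad.append((optimal_gain(up) - optimal_gain(down)) / (2 * h))
    return grad


def projected(grad):
    """Subtract the mean so that the components sum to zero."""
    mean = sum(grad) / len(grad)
    return [g - mean for g in grad]


b = [2.0, 3.0, 6.0]  # hypothetical bonuses
print(gradient(b))
print(numerical_gradient(b))
print(projected(gradient(b)))
```

Subtracting the mean of the gradient components is the simplest way to keep the total amount of bonuses fixed; it is an assumption of this sketch that this is the intended projection.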



The shifting. Now suppose we are in a position to make small changes, of
magnitude $\varepsilon$, to the payoff function of the Alpha Provider. How should
we apply them in order to make the gain of B increase as much as possible? The
answer is given by the formula (16), according to which the Alpha Provider
has to do the following:

• Find the relative frequencies $p_j$ of the users' queries $\alpha_j$.

• Calculate the average of their squares, $w = \frac{1}{n} \sum_j p_j^2$.

• Slightly re-evaluate the instances, placing more bonuses on queries
whose squared frequencies are above the threshold value $w$, taking them from
rarely asked questions, whose squared frequencies are below $w$.

As a result, the equilibrium will shift, the frequencies of users' requests will
adjust accordingly and the Alpha Provider will increase his gain, as follows
from (11), by

6b = ej^^q, (17) 

3 
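One shifting step can be sketched as follows. The sketch assumes that the optimal gain is $H_B = 1 / \sum_k (1/b_k)$, that the equilibrium query frequencies are $p_k = H_B / b_k$, and that bonuses are moved in proportion to $p_k^2 - w$; the bonus values, the step size `eps` and the function names are hypothetical.

```python
def optimal_gain(b):
    """Optimal gain of B, assuming H_B = 1 / sum_k (1 / b_k)."""
    return 1.0 / sum(1.0 / x for x in b)


def shift(b, eps):
    """Move bonuses toward queries whose squared frequency exceeds w."""
    gain = optimal_gain(b)
    p = [gain / x for x in b]            # equilibrium query frequencies
    w = sum(x * x for x in p) / len(p)   # average of the squared frequencies
    return [bk + eps * (pk * pk - w) for bk, pk in zip(b, p)]


bonuses = [2.0, 3.0, 6.0]  # hypothetical bonuses of the provider
shifted = shift(bonuses, eps=0.1)

# The total amount of bonuses is preserved, while the gain changes.
print(sum(bonuses), sum(shifted))
print(optimal_gain(bonuses), optimal_gain(shifted))
```

For a sufficiently small `eps` all bonuses stay positive and the gain grows; a large step may violate both, so the step size has to be chosen with care.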

Conclusions



So far, we have described the process of information retrieval as a non-antagonistic
conflict between two parties: users and providers. The mathematical
model of such a conflict is a non-cooperative bimatrix game. Starting from
the assumption that the de facto search log statistics is the Nash equilibrium of
a certain game, we provide a method of calculating the parameters (9) of this
game, thus solving the appropriate inverse problem.

A significant, somewhat counter-intuitive consequence of Nash theory is
that in this class of games the equilibrium, i.e. stable, behavior of the User is
completely determined only by the distribution of priorities of the Provider.
From this, we infer suggestions for the provider on how to affect the mass behavior
of users.